Network-based spam filter on Twitter

Authors

  • Ziyan Zhou
  • Lei Sun
Abstract

Rapidly growing micro-blogging social networks, such as Twitter, have been infiltrated by a large number of spam accounts. Limited to 140 characters, Twitter spam is often vastly different from traditional email spam and link spam, so conventional content-based spam filtering methods are insufficient. Many researchers have proposed schemes to detect spammers on Twitter. Most of these schemes rely on features of the user account or of the content, such as content similarity or the ratio of URLs. In this paper, we propose a network-analysis-based spam filter for Twitter. By analyzing the network structure and the relations between senders and receivers, this spam filter does not require large up-front data collection and can therefore detect spam in almost real time. Using the public API methods provided by Twitter, our system crawls the users in the sub-graph between suspicious senders and receivers. We then analyze the structure and properties of that sub-graph and compare them with those collected from legitimate senders and receivers. Our study showed that spammers rarely have a network distance of 4 or less from their victim. Using a sub-graph of diameter 4 constructed between sender and receiver, we can further improve the recall of our spam filter, with promising results, by utilizing network-based features such as the number of independent paths and normalized PageRank.
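The distance observation in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `follows` adjacency dictionary, the treatment of relations as directed edges, and the toy graph are all assumptions made for illustration. It runs a breadth-first search from sender to receiver, bounded at 4 hops, and treats a pair with no path within that bound as suspicious.

```python
from collections import deque


def network_distance(follows, sender, receiver, max_hops=4):
    """Return the hop count from sender to receiver, or None if it exceeds max_hops.

    `follows` is a hypothetical adjacency dict: user -> iterable of neighbours.
    """
    if sender == receiver:
        return 0
    seen = {sender}
    frontier = deque([(sender, 0)])
    while frontier:
        user, dist = frontier.popleft()
        if dist == max_hops:
            # Do not expand beyond the hop limit.
            continue
        for neighbour in follows.get(user, ()):
            if neighbour == receiver:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return None


def looks_suspicious(follows, sender, receiver, max_hops=4):
    """Flag a sender with no path of at most max_hops hops to the receiver."""
    return network_distance(follows, sender, receiver, max_hops) is None


# Toy graph (invented data, for illustration only).
follows = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "spammer": ["x1"],
    "x1": ["x2"],
    "x2": ["x3"],
    "x3": ["x4"],
    "x4": ["dave"],
}

print(network_distance(follows, "alice", "dave"))
print(looks_suspicious(follows, "spammer", "dave"))
```

Bounding the search at 4 hops keeps the crawl small, which matches the abstract's point about avoiding large up-front data collection. The other features mentioned, independent paths and normalized PageRank, would be computed on the same bounded sub-graph and are not shown here.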


Similar resources

Twitter Content-Based Spam Filtering

Twitter has become one of the most used social networks and, as happens with every popular medium, it is prone to misuse. In this context, spam on Twitter has emerged in recent years, becoming an important problem for users. Several approaches have appeared recently that are able to determine whether a user is a spammer or not. However, these blacklisting systems cannot filter ...


An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest in using short message service (SMS) as one of the essential and straightforward communication services on mobile devices. The increased popularity of this service has also increased the number of attacks on mobile devices, such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...


Sentiment Based Twitter Spam Detection

Spam is becoming a serious threat to users of online social networks, especially ones like Twitter. Twitter's structural features make it more vulnerable to spam attacks. In this paper, we propose a spam detection approach for Twitter based on sentiment features. We perform our experiments on a data collection of 29K tweets with 1K tweets for 29 trending topics of 2012 on twitt...


An analysis of 14 Million tweets on hashtag-oriented spamming

Over the years, Twitter has become a popular platform for information dissemination and information gathering. However, the popularity of Twitter has attracted not only legitimate users but also spammers who exploit social graphs, popular keywords, and hashtags for malicious purposes. In this paper, we present a detailed analysis of the HSpam14 dataset, which contains 14 million tweets with spa...


On the Analysis of Twitter Spam Accounts in Saudi Arabia

Twitter spam accounts try to spread malicious content, deceive users, or promote certain ideas over the Twitter network. Different approaches have been presented in both industry and academia to identify spammers on Twitter. This study aims at understanding the behavior of Twitter spam accounts targeting Saudi Arabia. In this study the author performs an empirical analysis of Twitter spam accounts in...




Publication date: 2014